
[SPARK-1260]: faster construction of features with intercept #161

Closed
wants to merge 2 commits

Conversation

mengxr (Contributor) commented Mar 17, 2014

The current implementation uses `Array(1.0, features: _*)` to construct a new array with intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.

Also, I don't see a reason to set the initial weights to ones, so I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260
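The difference between the two constructions can be sketched in plain Scala (a minimal illustration, not Spark's actual code; the helper names are made up for this example):

```scala
// Prepending an intercept term (1.0) to a feature array.
object InterceptExample {
  // Array.apply receives the elements as varargs and copies them
  // one by one in a loop, so this is O(n) with per-element overhead.
  def withInterceptSlow(features: Array[Double]): Array[Double] =
    Array(1.0, features: _*)

  // `1.0 +: features` (i.e. Array.+:) allocates once and bulk-copies
  // the existing elements, which is faster for large arrays.
  def withInterceptFast(features: Array[Double]): Array[Double] =
    1.0 +: features
}
```

Both produce the same result; only the construction cost differs.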

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13213/

rxin (Contributor) commented Mar 18, 2014

Thanks. I've merged this!

@asfgit asfgit closed this in e108b9a Mar 18, 2014
@mengxr mengxr deleted the sgd branch March 18, 2014 22:40
mengxr added a commit to mengxr/spark that referenced this pull request Mar 19, 2014
The current implementation uses `Array(1.0, features: _*)` to construct a new array with intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.

Also, I don't see a reason to set initial weights to ones. So I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#161 from mengxr/sgd and squashes the following commits:

b5cfc53 [Xiangrui Meng] set default weights to zeros
a1439c2 [Xiangrui Meng] faster construction of features with intercept
ericl pushed a commit to ericl/spark that referenced this pull request Jan 23, 2017
## What changes were proposed in this pull request?

This PR adds a new project `ql-kafka-0-8` to support Kafka 0.8 for Structured Streaming. It follows the design of Kafka 0.10 source except:
- Don't support `subscribePattern`. Because without the 0.10 Kafka APIs, we need to ask all topics from Zookeeper and filter topics by ourselves.
- Don't support `failOnDataLoss` option. It means that the user cannot delete topics, otherwise the query will fail.

In addition, compared to the DStream Kafka 0.8 source, it has the following additional feature:
- Support discovering new partitions of a topic if the user uses `subscribe` option.
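The "disallow unsupported options" behavior described above (and enforced in the follow-up PR) can be sketched as a plain validation step; the object and option names below are assumptions for illustration, not the source's actual API:

```scala
// A sketch of option validation for a Kafka 0.8 source that rejects
// options only available with the Kafka 0.10 APIs.
object Kafka08OptionCheck {
  // Per the description: pattern subscription needs the 0.10 APIs,
  // and failOnDataLoss cannot be supported without them.
  val unsupported: Set[String] = Set("subscribePattern", "failOnDataLoss")

  // Returns an error message for the first unsupported option, if any.
  def validate(options: Map[String, String]): Either[String, Map[String, String]] =
    options.keys.find(unsupported.contains) match {
      case Some(k) => Left(s"Option '$k' is not supported by the Kafka 0.8 source")
      case None    => Right(options)
    }
}
```

With this shape, a query using `subscribe` passes validation, while one using `subscribePattern` fails fast at analysis time instead of at runtime.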

## How was this patch tested?

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

Author: Shixiong Zhu <shixiong@databricks.com>

Closes apache#161 from zsxwing/kafka08.
ericl pushed a commit to ericl/spark that referenced this pull request Jan 23, 2017
## What changes were proposed in this pull request?

A follow up PR for apache#161 to disallow unsupported options.

## How was this patch tested?

`test("unsupported options")`

Author: Shixiong Zhu <shixiong@databricks.com>

Closes apache#169 from zsxwing/kafka08-errors.
ash211 referenced this pull request in palantir/spark Mar 3, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments

(cherry picked from commit f6823f3)
lins05 pushed a commit to lins05/spark that referenced this pull request Apr 23, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments
erikerlandson pushed a commit to erikerlandson/spark that referenced this pull request Jul 28, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments
yoonlee95 pushed a commit to yoonlee95/spark that referenced this pull request Aug 17, 2017
YSPARK-713: Made changes to spark-env-gen.sh to resolve keystore and truststore url on QE cluster
jlopezmalla pushed a commit to jlopezmalla/spark that referenced this pull request Feb 27, 2018
Igosuki pushed a commit to Adikteev/spark that referenced this pull request Jul 31, 2018
…emporary disconnection between driver and Mesos master. (apache#161)
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Do refactor for Ansible jobs to keep struct consistent
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
* Revert "release r49 (apache#162)"

This reverts commit 62da28f.

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

* revert ae release r52

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

Co-authored-by: 7mming7 <7mming7@gmail.com>
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
* Revert "release r49 (apache#162)"

This reverts commit 62da28f.

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

* revert ae skew release r51

Co-authored-by: 7mming7 <7mming7@gmail.com>